
Generating Multi-Categorical Samples with Generative Adversarial Networks

Author: Camino Ramiro
Hammerschmidt Christian
State Radu
Publication date: 01/07/2018

We propose a method to train generative adversarial networks on multivariate feature vectors representing multiple categorical values. In contrast to the continuous domain, where GAN-based methods have delivered considerable results, GANs struggle to perform equally well on discrete data. We propose and compare several architectures based on multiple (Gumbel) softmax output layers that take the structure of the data into account. We evaluate the performance of our architecture on datasets with different sparsity, number of features, ranges of categorical values, and dependencies among the features. Our proposed architecture and method outperform existing models.

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

Human in the Loop: Interactive Passive Automata Learning via Evidence-Driven State-Merging Algorithms

Author: Hammerschmidt Christian A.
State Radu
Verwer Sicco
Publication date: 28/07/2017

We present an interactive version of an evidence-driven state-merging (EDSM) algorithm for learning variants of finite state automata. Learning these automata often amounts to recovering or reverse engineering the model that generated the data, despite noisy, incomplete, or imperfectly sampled data sources, rather than optimizing a purely numeric target function. Domain expertise and human knowledge about the target domain can guide this process and are typically captured in parameter settings. Often, domain expertise is subconscious and not expressed explicitly. Directly interacting with the learning algorithm makes it easier to utilize this knowledge effectively. Comment: 4 pages, presented at the Human in the Loop workshop at ICML 201

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

Improving Missing Data Imputation with Deep Generative Models

Author: Camino Ramiro D.
Hammerschmidt Christian A.
State Radu
Publication date: 27/02/2019

Datasets with missing values are very common in industry applications, and they can have a negative impact on machine learning models. Recent studies introduced solutions to the problem of imputing missing values based on deep generative models. Previous experiments with Generative Adversarial Networks and Variational Autoencoders showed interesting results in this domain, but it is not clear which method is preferable for different use cases. The goal of this work is twofold: we present a comparison between missing data imputation solutions based on deep generative models, and we propose improvements over those methodologies. We run our experiments using known real-life datasets with different characteristics, removing values at random and reconstructing them with several imputation techniques. Our results show that the presence or absence of categorical variables can alter the selection of the best model, and that some models are more stable than others after similar runs with different random number generator seeds.

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

Minority Class Oversampling for Tabular Data with Deep Generative Models

Author: Camino Ramiro
Hammerschmidt Christian
State Radu
Publication date: 07/05/2020

In practice, machine learning experts are often confronted with imbalanced data. Without accounting for the imbalance, common classifiers perform poorly and standard evaluation metrics mislead practitioners about the model's performance. A common method to treat imbalanced datasets is under- and oversampling. In this process, samples are either removed from the majority class or synthetic samples are added to the minority class. In this paper, we follow up on recent developments in deep learning. We take proposals of deep generative models, including our own, and study the ability of these approaches to provide realistic samples that improve performance on imbalanced classification tasks via oversampling. Across 160K+ experiments, we show that all of the new methods tend to perform better than simple baseline methods such as SMOTE, but require different under- and oversampling ratios to do so. Our experiments show that the sampling method does not affect quality, but runtime varies widely. We also observe that the improvements in terms of performance metrics, while shown to be significant when ranking the methods, are often minor in absolute terms, especially compared to the required effort. Furthermore, we notice that a large part of the improvement is due to undersampling, not oversampling. We make our code and testing framework available.

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

„Gaudeamus igitur?“ – Prevalence and factors of influence on alcohol consumption by students

Author: Hammerschmidt Christian
Publication date: 21/11/2011

The aim of this study was to examine alcohol consumption habits in a subpopulation that has so far received little attention, namely students. To this end, prevalences were determined for the amount consumed, consumption patterns, binge drinking, and alcohol-related disorders such as abuse or dependence. In addition, specific factors that may influence students' alcohol consumption were examined: demographic factors, the structure of the degree programme, especially against the background of the changes to degree qualifications under the Bologna agreement, and general psychological stress. The data were collected in 2008 through an online survey of 2,348 students at three universities in Lower Saxony: the Technical University of Braunschweig, the University of Arts in Braunschweig (HBK Braunschweig), and the University of Applied Sciences Braunschweig/Wolfenbüttel. Subsequently, a structured clinical interview was conducted with 72 participants of the online survey in order to assess the quality of the online questionnaire and to examine pathological drinking patterns and psychological abnormalities more specifically. The results show a markedly elevated level of alcohol consumption. For the 30 days before the survey, only 10.2% of the students reported abstaining, 70.8% drank at low risk, and 19% consumed alcohol at an at least risky level. This is also reflected in binge drinking: 34.9% of the students were binge drinkers, and a further 14.5% were so-called heavy users, who binge-drank on five or more days a month. According to an alcoholism screening instrument (CAGE), 30.3% showed at least abusive, if not dependent, consumption. The results reveal consumption patterns that differ clearly from those of the general population, including people of the same age. However, the reasons appear to lie less in internal conditions, such as coping with depression or anxiety, than in socially motivated factors, some of which even have a positive effect on psychological well-being. Further explanations are discussed in conclusion.

Digitale Bibliothek Braunschweig

flexfringe: A Passive Automaton Learning Package

Author: Verwer Sicco E.
Hammerschmidt Christian
Publication date: 01/07/1995

Crossref

Open Repository and Bibliography - Luxembourg

flexfringe: A Passive Automaton Learning Package

Author: Hammerschmidt Christian
Verwer Sicco E.
Publication date: 01/09/2017

Crossref

Open Repository and Bibliography - Luxembourg

Acoustic structure of male loud-calls support molecular phylogeny of Sumatran and Javanese leaf monkeys (genus Presbytis)

Author: Hammerschmidt Kurt
Hodges John K
Meyer Dirk
Rinaldi Dones
Roos Christian
Wijaya Ambang
Publication venue: BioMed Central
Publication date: 01/01/2012

Abstract. Background: The degree to which loud-calls in nonhuman primates can be used as a reliable taxonomic tool is the subject of ongoing debate. A recent study on crested gibbons showed that these species can be well distinguished by their songs; even at the population level the authors found reliable differences. Although there are some further studies on geographic and phylogenetic differences in loud-calls of nonhuman primate species, it is unclear to what extent loud-calls of other species have a similarly close relation between acoustic structure, phylogenetic relatedness and geographic distance. We therefore conducted a field survey in 19 locations on Sumatra, Java and the Mentawai islands to record male loud-calls of wild surilis (Presbytis), a genus of Asian leaf monkeys (Colobinae) with disputed taxonomy, and compared the structure of their loud-calls with a molecular genetic analysis. Results: The acoustic analysis of 100 surili male loud-calls from 68 wild animals confirms the differentiation of P. potenziani, P. comata, P. thomasi and P. melalophos. In a more detailed acoustic analysis of subspecies of P. melalophos, a further separation of the southern P. m. mitrata confirms the proposed paraphyly of this group. In concordance with their geographic distribution, we found the highest correlation between call structure and genetic similarity, and less significant correlations between call structure and geographic distance, and between genetic similarity and geographic distance. Conclusions: In this study we show that, as in crested gibbons, the acoustic structure of surili loud-calls is a reliable tool to distinguish between species and to verify phylogenetic relatedness and migration backgrounds of the respective taxa. Since vocal production in other nonhuman primates shows similar constraints, it is likely that an acoustic analysis of call structure can help to clarify taxonomic and phylogenetic relationships.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Generating Multi-Categorical Samples with Generative Adversarial Networks

Author: Camino Ramiro Daniel
Hammerschmidt Christian
State Radu
Publication date: 01/07/2018

Open Repository and Bibliography - Luxembourg